Accelerated Variational Infinite Mixture Models
Abstract
Infinite mixture models, such as the Dirichlet process mixture, are promising candidates for clustering applications where the number of clusters is unknown a priori. Due to computational considerations, these models are unfortunately unsuitable for large-scale data-mining applications. We propose a class of deterministic accelerated infinite mixture models that can routinely handle millions of data cases. The speedup is achieved by incorporating kd-trees into a variational Bayesian algorithm for infinite mixture models in the stick-breaking representation. Besides its use of kd-trees, this algorithm also differs from that of Blei and Jordan (2005) in how it handles truncation: we only assume that the variational distributions are fixed at their priors beyond a certain truncation level. Experiments show that speedups relative to the standard variational algorithm can be significant.
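To make the stick-breaking representation and the role of a truncation level concrete, here is a minimal Python sketch. It is not the paper's algorithm: it only samples mixture weights from a stick-breaking prior and computes how much prior mass is expected to remain beyond a truncation level. The function names, the concentration parameter `alpha`, and the chosen truncation level are assumptions made for this illustration.

```python
import random


def stick_breaking_weights(alpha, truncation, rng):
    """Draw the first `truncation` weights of a stick-breaking prior.

    Illustrative helper (hypothetical name): each v_i ~ Beta(1, alpha) and
    weight_i = v_i * prod_{j < i} (1 - v_j).
    """
    weights = []
    remaining = 1.0  # length of the stick not yet assigned to a component
    for _ in range(truncation):
        v = rng.betavariate(1.0, alpha)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    return weights, remaining


def expected_tail_mass(alpha, truncation):
    """Prior expected mass beyond the truncation level.

    The v_i are independent with E[1 - v_i] = alpha / (1 + alpha), so the
    expected leftover stick is (alpha / (1 + alpha)) ** truncation.
    """
    return (alpha / (1.0 + alpha)) ** truncation


rng = random.Random(0)
weights, tail = stick_breaking_weights(alpha=1.0, truncation=20, rng=rng)
print("sampled mass in first 20 components:", sum(weights))
print("sampled leftover mass:", tail)
print("expected leftover mass under the prior:", expected_tail_mass(1.0, 20))
```

Because the expected leftover mass decays geometrically in the truncation level, components beyond a moderate level carry little prior weight, which gives some intuition for why leaving their variational distributions at the prior can be a mild assumption.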
Similar resources
Accelerated Variational Dirichlet Process Mixtures
Dirichlet Process (DP) mixture models are promising candidates for clustering applications where the number of clusters is unknown a priori. Due to computational considerations these models are unfortunately unsuitable for large scale data-mining applications. We propose a class of deterministic accelerated DP mixture models that can routinely handle millions of data-cases. The speedup is achie...
Visual Scenes Clustering Using Variational Incremental Learning of Infinite Generalized Dirichlet Mixture Models
In this paper, we develop a clustering approach based on variational incremental learning of a Dirichlet process of generalized Dirichlet (GD) distributions. Our approach is built on nonparametric Bayesian analysis where the determination of the complexity of the mixture model (i.e. the number of components) is sidestepped by assuming an infinite number of mixture components. By leveraging an i...
Infinite models for speaker clustering
In this paper we propose the use of infinite models for the clustering of speakers. Speaker segmentation is obtained through a Dirichlet Process Mixture (DPM) model, which can be interpreted as a flexible model with an infinite a priori number of components. Learning is based on a Variational Bayesian approximation of the infinite sequence. The DPM model is compared with fixed prior systems learned b...
Truncation-free Hybrid Inference for DPMM
Dirichlet process mixture models (DPMM) are a cornerstone of Bayesian nonparametrics. While these models free the user from choosing the number of components a priori, computationally attractive variational inference often reintroduces the need to do so, via a truncation on the variational distribution. In this paper we present a truncation-free hybrid inference for DPMM, combining the advantages of sam...
Memoized Online Variational Inference for Dirichlet Process Mixture Models
Variational inference algorithms provide the most effective framework for large-scale training of Bayesian nonparametric models. Stochastic online approaches are promising, but are sensitive to the chosen learning rate and often converge to poor local optima. We present a new algorithm, memoized online variational inference, which scales to very large (yet finite) datasets while avoiding the com...
Publication date: 2006